Smart city energy efficient data privacy preservation protocol based on biometrics and fuzzy commitment scheme

Advancements in cloud computing, flying ad-hoc networks, wireless sensor networks, artificial intelligence, big data, 5th generation mobile networks and the internet of things have led to the development of smart cities. Owing to their massive interconnectedness, high volumes of data are collected and exchanged over the public internet. Therefore, the exchanged messages are susceptible to numerous security and privacy threats across these open public channels. Although many security techniques have been designed to address this issue, most of them are still vulnerable to attacks while some deploy computationally intensive cryptographic operations such as bilinear pairings and blockchain. In this paper, we leverage biometrics, error correction codes and fuzzy commitment schemes to develop a secure and energy efficient authentication scheme for smart cities. This is informed by the fact that biometric data is cumbersome to reproduce and hence attacks such as side-channeling are thwarted. We formally analyze the security of our protocol using the Burrows–Abadi–Needham (BAN) logic, which shows that our scheme achieves strong mutual authentication among the communicating entities. The semantic analysis of our protocol shows that it mitigates attacks such as de-synchronization, eavesdropping, session hijacking, forgery and side-channeling. In addition, its formal security analysis demonstrates that it is secure under the Canetti and Krawczyk attack model. In terms of performance, our scheme is shown to reduce the computation overheads by 20.7% and hence is the most efficient among the state-of-the-art protocols.

• We leverage biometrics, error correction codes and fuzzy commitment schemes to develop a secure and energy efficient authentication scheme for smart cities.
• Unlike the majority of the current schemes that deploy timestamps to prevent replay attacks, our protocol incorporates random nonces in all exchanged messages. This is demonstrated to address security issues such as de-synchronization attacks inherent in timestamp-based schemes.
• We execute extensive formal security analysis using the BAN logic to show that our scheme performs strong mutual authentication and key negotiation in an appropriate manner.
• Informal security analysis is carried out to demonstrate that the proposed protocol supports numerous functional and security features such as strong mutual authentication, anonymity and perfect key secrecy. In addition, this analysis shows that our scheme can withstand a myriad of smart city security threats such as session hijacking, privileged insider and side-channeling attacks.
• Elaborate comparative evaluations are carried out to show that the proposed protocol incurs the lowest computation overheads and hence is energy efficient.
The rest of this paper is structured as follows: "Related work" section discusses related works while "The proposed protocol" section presents the proposed protocol. "Security analysis" section discusses the security analysis of our scheme while "Performance evaluation" section describes its performance evaluation. Finally, "Conclusion and future work" section presents the conclusion and future research work.

Mathematical preliminaries
In this section, we provide some mathematical formulations for the key cryptographic building blocks of the proposed scheme. These include fuzzy commitment, one way hashing and error correcting codes.

One way hashing
Suppose that $\mathbb{N}$ is the set of all positive integers, $P_k$ is a family of uniform probability distributions and $\mathcal{L}$ is a polynomial such that $\mathcal{L}(k) > k$. Then, $H$ represents a family of functions defined by $H = \bigcup_k H_k$, where $H_k$ is a multi-set of functions from $\{0,1\}^{\mathcal{L}(k)}$ to $\{0,1\}^{k}$. Here, $P_k(x) = 1/2^{\mathcal{L}(k)}$ for all $x \in \{0,1\}^{\mathcal{L}(k)}$. $H$ is referred to as a hash function, which compresses $\mathcal{L}(k)$-bit input strings into $k$-bit output strings.
Definition 1 Let us consider two strings $a, b \in \{0,1\}^{\mathcal{L}(k)}$, where $a \neq b$. We say that string $a$ collides with string $b$ under $h \in H_k$, or $(a, b)$ is a collision pair for $h$, provided that $h(a) = h(b)$.
Definition 2 $H$ is regarded as polynomial time computable on condition that there exists a polynomial (in $k$) time algorithm that computes all $h \in H$.
Definition 3 $H$ is regarded as accessible provided that there exists a probabilistic polynomial time algorithm which takes input $k \in \mathbb{N}$ and outputs, uniformly at random, a description of $h \in H_k$.
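As an illustration of Definition 1, the following minimal sketch searches for a collision pair under a deliberately weak toy hash. The function names `toy_hash` and `find_collision`, the use of truncated SHA-256 and the parameters (32-bit inputs, 16-bit outputs) are assumptions made for this example only; they are not part of the proposed scheme.

```python
import hashlib

def toy_hash(data: bytes, k_bits: int = 16) -> int:
    """Compress the input to k bits by keeping the top k bits of SHA-256.

    Truncation is an assumption of this sketch; it makes collisions easy to find.
    """
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - k_bits)

def find_collision(k_bits: int = 16):
    """Return two distinct 32-bit inputs (a, b) with toy_hash(a) == toy_hash(b)."""
    seen = {}
    counter = 0
    while True:
        a = counter.to_bytes(4, "big")   # L(k) = 32 input bits, k = 16 output bits
        tag = toy_hash(a, k_bits)
        if tag in seen:
            return seen[tag], a          # earlier input and current input collide
        seen[tag] = a
        counter += 1

a, b = find_collision()
print(a.hex(), b.hex(), toy_hash(a) == toy_hash(b))
```

Because only $2^{16}$ outputs exist, the pigeonhole principle guarantees that the search terminates. A collision-resistant hash with a full-length output makes the same search computationally infeasible.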

Error correcting codes
In noisy transmission channels, an error correcting code (ecc) is crucial for accurate reception of the transmitted data. In particular, error correcting codes are critical in fuzzy commitment systems, where they ensure that data is exchanged accurately over noisy transmission channels. Suppose that $\Psi$ is a set of messages, where $\Psi = \{0,1\}^{\phi}$. Then, an error correcting code is made up of a set of codephrases $CP \subseteq \{0,1\}^{\rho}$. A typical ecc comprises a translation function $\omega$ and a decoding function $f$, where $\omega: \Psi \to CP$ and $f: \{0,1\}^{\rho} \to CP \cup \{\gamma\}$. Denoting the Hamming distance as $H$, the decoding function maps a $\rho$-bit string $S$ to the closest codephrase in $CP$ in terms of $H$; otherwise it outputs $\gamma$. Prior to transmission, any message $\psi \in \Psi$ is mapped to an element in $CP$. For improved redundancy, $\rho > \phi$. Suppose that $\theta$ is the correction threshold and $\tau \in \{0,1\}^{\rho}$ is the error term. Then, for codephrase $cp \in CP$ and Hamming weight $\|\tau\| \le \theta$, we have $f(cp \oplus \tau) = cp$.
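The property $f(cp \oplus \tau) = cp$ can be sketched with a small repetition code. The choice of a 5-fold repetition code, the parameter values ($\phi = 2$, $\rho = 10$, $\theta = 2$) and the names `encode`, `decode` and `hamming` are assumptions of this example; the text above does not prescribe a particular code. The value `None` stands in for $\gamma$.

```python
from itertools import product

PHI = 2      # message length phi in bits (assumed for this sketch)
REP = 5      # repetitions per bit, so rho = PHI * REP = 10
THETA = 2    # correction threshold; this code has minimum distance 5

def encode(message):
    """Translation function omega: repeat every message bit REP times."""
    return tuple(bit for bit in message for _ in range(REP))

# The full set of codephrases CP, one per possible message.
CODEPHRASES = [encode(m) for m in product((0, 1), repeat=PHI)]

def hamming(x, y):
    """Hamming distance between two equal-length bit tuples."""
    return sum(a != b for a, b in zip(x, y))

def decode(received):
    """Decoding function f: nearest codephrase within THETA, else None (gamma)."""
    nearest = min(CODEPHRASES, key=lambda cp: hamming(cp, received))
    return nearest if hamming(nearest, received) <= THETA else None

cp = encode((1, 0))
tau = (0, 1, 0, 0, 1, 0, 0, 0, 0, 0)               # error term of weight 2
noisy = tuple(c ^ t for c, t in zip(cp, tau))       # cp XOR tau
print(decode(noisy) == cp)
print(decode((1, 1, 0, 0, 0, 1, 1, 0, 0, 0)))       # too far from every codephrase
```

An exhaustive nearest-codephrase search is only practical for toy parameters; practical codes such as BCH or Reed–Solomon use algebraic decoders instead.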

Fuzzy commitment
Due to the noisy nature of biometric data, the input biometrics are rarely identical to the stored biometric template. Therefore, the biometric template can be deployed in fuzzy commitment schemes. Suppose that $h: \{0,1\}^{\rho} \to \{0,1\}^{\chi}$ is a collision-resistant one-way hashing function. We also let $w$ be the witness, $\lambda = h(cp)$ and $\varepsilon = w \oplus cp$. Then, the fuzzy commitment scheme $F: (\{0,1\}^{\rho}, \{0,1\}^{\rho}) \to (\{0,1\}^{\chi}, \{0,1\}^{\rho})$ commits codephrase $cp \in CP$ using a $\rho$-bit witness $w$ as $F(cp, w) = (\lambda, \varepsilon)$. Provided that witness $w^*$ is fairly close to $w$ but not necessarily equal to $w$, commitment $F(cp, w) = (\lambda, \varepsilon)$ can be opened using $w^*$. Suppose that this commitment is sent from $T$ towards $R$. The opening of this commitment at $R$ using $w^*$ involves the derivation of $cp^* = f(w^* \oplus \varepsilon)$. Since $\varepsilon = w \oplus cp$, $cp^*$ can also be expressed as $cp^* = f(cp \oplus (w^* \oplus w))$. Thereafter, $R$ confirms whether $\lambda \stackrel{?}{=} h(cp^*)$. Provided that this condition holds, the fuzzy commitment is effectively opened. Otherwise, witness $w^*$ is flagged as invalid. We apply this fuzzy commitment concept in our biometric authentication procedures by treating the biometric template as witness $w$. As such, the user inputs biometric data (seen as witness $w^*$) which is deployed to open codephrase $cp$, provided that $w^*$ is sufficiently close to $w$, that is, the Hamming weight of $w^* \oplus w$ does not exceed the correction threshold $\theta$.
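The commit and open operations described above can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 plays the role of $h$, a 5-fold repetition code with majority-vote decoding plays the role of $f$, and the witness is a random bit list rather than a real biometric template. The names `commit` and `open_commitment` are hypothetical.

```python
import hashlib
import secrets

REP = 5  # repetition factor of the toy error correcting code (assumption)

def encode(bits):
    """Map message bits to a codephrase by repeating each bit REP times."""
    return [b for b in bits for _ in range(REP)]

def decode(bits):
    """Decoding function f: majority-vote each block back to a codephrase."""
    out = []
    for i in range(0, len(bits), REP):
        block = bits[i:i + REP]
        out.extend([int(sum(block) > REP // 2)] * REP)
    return out

def h(bits):
    """One-way hash of a bit list (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(bytes(bits)).hexdigest()

def xor(x, y):
    return [a ^ b for a, b in zip(x, y)]

def commit(cp, w):
    """F(cp, w) = (lambda, epsilon) with lambda = h(cp), epsilon = w XOR cp."""
    return h(cp), xor(w, cp)

def open_commitment(lam, eps, w_star):
    """Recover cp* = f(w* XOR epsilon); accept only if h(cp*) equals lambda."""
    cp_star = decode(xor(w_star, eps))
    return cp_star if h(cp_star) == lam else None

cp = encode([1, 0, 1, 1])
w = [secrets.randbelow(2) for _ in range(len(cp))]   # stand-in biometric template
lam, eps = commit(cp, w)

w_close = w.copy()
w_close[3] ^= 1                                       # one noisy bit
w_far = [b ^ 1 for b in w]                            # every bit differs

print(open_commitment(lam, eps, w_close) == cp)
print(open_commitment(lam, eps, w_far))
```

The close witness opens the commitment because the single flipped bit is absorbed by the decoder, while the distant witness decodes to a different codephrase whose hash does not match $\lambda$.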

Attack model
In the proposed scheme, the adversary is assumed to have all the capabilities in the Canetti and Krawczyk (CK) threat model. Under this model, communication within the smart city is executed over the public internet and hence the attacker can have full control of this channel. In addition, the attacker can eavesdrop, alter, delete and insert bogus messages in the communication channel during message exchanges over the public smart city wireless channels. Moreover, all the sensitive data stored in the sensor nodes can be extracted upon physical capture of these nodes. It is also possible for all secret information, ephemeral secrets and session states to be compromised via session-hijacking attacks.

Related work
Many security techniques have been developed over the recent past to offer security protection in IoT and other devices interconnected in smart cities [27–31]. However, these schemes have extensive communication and computation overheads [32]. Although the protocol in [33] is lightweight and hence can address this issue, it cannot withstand outsider attackers [34]. Blockchain technology [35] can provide authentication and decentralized management of identity as well as authorization policies. Therefore, many blockchain-based security schemes have been presented in [36–43]. However, these schemes incur high storage and computation overheads which are not suitable for the sensors [44]. Therefore, a lightweight authentication scheme is developed in [3]. However, the communication cost analysis of this scheme is missing. In addition, it has not been evaluated against attacks such as side-channeling and de-synchronization.
Based on the Physically Unclonable Function (PUF), mutual authentication schemes are presented in [4, 45, 46]. Although these protocols can withstand physical capture and side-channeling attacks, PUF-based schemes have stability challenges [47]. On the other hand, biometric-based schemes have been introduced in [48–51]. However, the three-factor authentication protocol in [48] cannot preserve perfect backward secrecy [52]. Therefore, an improved scheme is presented in [52]. Unfortunately, this protocol is susceptible to offline password guessing, forgery, session key disclosure and replay attacks [49]. In addition, it cannot uphold perfect forward secrecy and data confidentiality. On the other hand, the protocol in [50] is vulnerable to impersonation and stolen verifier attacks [51]. In addition, it fails to preserve user untraceability. To prevent single-point of failure attacks, a scheme that is devoid of a trusted issuer is developed in [53]. However, comparative security and performance analyses of this scheme have not been carried out. Similarly, feasibility, scalability and comparative analyses against the state-of-the-art techniques are missing in [54].
To mitigate service-oriented attacks in smart cities, a context-based trust model is presented in [55]. However, processing huge volumes of contextual data results in high computation overhead [56]. Similarly, the quantum-inspired technique presented in [57] incurs extensive computation overheads due to the required quantum computing [58]. Although an energy-efficient framework for IoT developed in [59] can address this issue, its comparative performance and security analyses have not been carried out. The verification scheme in [60] is efficient and hence can address the performance issues in [55, 57]. However, it fails to provide robust identity check and user anonymity [61]. Similarly, the Elliptic Curve Cryptography (ECC) based protocol in [61] cannot offer anonymity and untraceability. Therefore, an ECC based anonymous authentication protocol is introduced in [13], while an identity based technique is presented in [62] to offer strong unforgeability and anonymity. Although the scheme in [13] is shown to resist DoS attacks, its numerous point multiplications can lead to high computation costs. Similarly, the fuzzy extractor based protocol in [63] incurs heavy computation overheads [32]. On the other hand, identity-based schemes have key escrow problems [64].
To protect smart cities against botnet attacks, an algorithm based on Long Short-Term Memory (LSTM) is developed in [65]. However, its evaluation is carried out on a single dataset of botnet attacks and hence fails to reflect a variety of attack vectors in a typical smart city. In addition, its performance evaluation in terms of the required resources has not been presented. To ensure access control and high security level, Public Key Cryptography (PKC) based protocols have been developed in [66–68]. However, these schemes are susceptible to physical capture attacks and hence their stored secret credentials can be retrieved [4]. Thereafter, the attackers are able to impersonate the entities whose credentials have been extracted. In addition, most of these PKC-based schemes incur extensive communication and computation overheads [69]. Moreover, the homomorphic encryption based protocol in [66] is vulnerable to privileged insider and session key disclosure attacks [4]. On its part, the bilinear pairing based protocol in [67] fails to offer perfect forward secrecy and cannot withstand impersonation attacks [68]. In addition, the deployed bilinear pairing operations incur extensive communication and computation overheads and hence cannot support real-time service provision in smart cities.
The ECC-based protocol developed in [68] is susceptible to impersonation, replay and privileged insider attacks [70]. In addition, it cannot offer strong mutual authentication among the communicating entities. Therefore, an improved security technique is presented in [70]. However, this protocol is vulnerable to attacks such as server spoofing, session key disclosure and forgery [4]. Although the schemes in [71, 72] can solve some of these challenges, they have not been evaluated against de-synchronization attacks. On their part, the three-factor security schemes in [48–52] are susceptible to potential security attacks [4]. Although the protocol in [73] addresses some of the attacks such as ephemeral leakage, it cannot withstand identity guessing attacks [74–76].
Based on the discussion above, it is evident that many schemes have been developed for the smart city environment. However, the attainment of perfect smart city security at low computation and communication costs is still an open challenge. For instance, many security protocols have been shown to be vulnerable to numerous attacks while others cannot support anonymity, mutual authentication and untraceability. In addition, some of these schemes do not incorporate biometric and password change procedures. Moreover, some of these security techniques incur extensive computation and communication overheads while others deploy a centralized architecture which can easily result in central failure, denial of service and privacy breaches [39]. The proposed protocol is demonstrated to address some of these security, performance and privacy challenges. For instance, our scheme incurs the lowest computation overheads among its peers and hence addresses performance challenges in most of the above protocols. In addition, it provides support for anonymity, mutual authentication and untraceability, which are features missing in most of the above schemes. Moreover, it mitigates attacks which are rarely considered in most of the existing protocols. Such attacks include de-synchronization, eavesdropping, session hijacking, forgery and side-channeling.

The proposed protocol
Elliptic curve cryptography offers strong security at relatively shorter key sizes compared to other public key cryptosystems such as RSA. Therefore, we deploy elliptic curve cryptography in the proposed scheme. To address physical and side-channeling attacks, we leverage biometrics, error correction codes and fuzzy commitment schemes.

Motivation
Smart cities have streamlined services in urban centers, leading to the enhancement of the quality of life of the citizens. In a typical smart city, numerous smart devices are interconnected to facilitate activities such as surveillance, shipping, logistics, healthcare and warehousing. As such, high volumes of data are generated and exchanged among these smart devices. Since these message exchanges are carried out over the public internet, many security and privacy threats lurk in this environment. For instance, personal user information can be eavesdropped over the public channels while successful sensor and device capture can facilitate impersonation attacks. Therefore, past research works have presented numerous security techniques to alleviate these challenges. Unfortunately, the majority of these schemes are based on computationally intensive cryptographic operations such as bilinear pairings. Consequently, these schemes are inefficient for the computation, bandwidth, storage and energy constrained sensor nodes. In addition, some of the presented security solutions still have security and privacy related issues [77, 78] such as susceptibility to physical, impersonation, privileged insider and

System setup
This phase is carried out by the gateway node $GW_k$. The goal is to derive the long term keys that will be utilized in the later phases of our scheme. The following 3 steps are executed during the system setup phase.
Step 1 The $GW_k$ selects some elliptic curve $E$ and additive group $G$ over finite field $F_p$. Here, the generator is point $P$ whose order is a large prime number $q$.
Step 2 $GW_k$ generates nonce $n \in Z_q^*$ and sets it as its secret key. Next, it derives its corresponding public key as $P_k = nP$.
Step 3 The $GW_k$ selects $M_k$ as its master key and privately keeps both $n$ and $M_k$. Finally, it publishes parameter set $\{P, P_k, G, E(F_p)\}$.
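The three setup steps can be sketched on a textbook-sized curve. Everything concrete here is an assumption made for illustration: the toy curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ with generator $P = (5, 1)$ of prime order $q = 19$, the 32-byte master key, and the function names. A real deployment would rely on a standardized curve and a vetted cryptographic library rather than hand-written arithmetic.

```python
import secrets

# Toy parameters (assumed): y^2 = x^3 + 2x + 2 over F_17, P = (5, 1), order q = 19.
P_MOD, A, B = 17, 2, 2
P = (5, 1)
Q_ORDER = 19

def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                   # p2 is the inverse of p1
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result

# Step 2: secret key n in Z_q^* and public key P_k = nP.
n = secrets.randbelow(Q_ORDER - 1) + 1
P_k = scalar_mult(n, P)

# Step 3: master key M_k stays private; only the parameter set is published.
M_k = secrets.token_bytes(32)
public_params = {"P": P, "P_k": P_k, "curve": (A, B, P_MOD)}
print(public_params)
```

Neither `n` nor `M_k` appears in `public_params`, mirroring the requirement that both remain private to the gateway node.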

Sensor node registration
Prior to actual deployment in their application domains, each sensor node $SN_j$ must be registered at the gateway node $GW_k$. The aim is to assign these sensors some security values that are deployed during the login, authentication and key negotiation phase. The following 2 steps are executed in this phase.
Step 1 The $GW_k$ chooses $SNID_j$ as the unique identity of sensor node $SN_j$. This is followed by the derivation of private key $K_{GS} = h(SNID_j \| M_k)$. $GW_k$ sends values $SNID_j$ and $K_{GS}$ to $SN_j$ over secure channels as shown in Fig. 2.
Step 2 Upon receiving parameters $SNID_j$ and $K_{GS}$ from the $GW_k$, the $SN_j$ stores them in its memory. The sensor node is now ready to be deployed to the field.
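A minimal sketch of the key derivation in Step 1 follows. SHA-256 as the hash $h$, the 16-byte identity, the 32-byte master key and the helper name `register_sensor` are assumptions of this example and are not specified in the text.

```python
import hashlib
import secrets

def h(*parts: bytes) -> bytes:
    """One-way hash over the concatenation of its inputs (SHA-256 assumed)."""
    return hashlib.sha256(b"".join(parts)).digest()

def register_sensor(master_key: bytes):
    """Gateway side: pick SNID_j and derive K_GS = h(SNID_j || M_k)."""
    snid_j = secrets.token_bytes(16)          # unique sensor identity (assumed length)
    k_gs = h(snid_j, master_key)
    return {"SNID_j": snid_j, "K_GS": k_gs}   # values sent over the secure channel

M_k = secrets.token_bytes(32)                 # gateway master key, never sent
sensor_memory = register_sensor(M_k)          # Step 2: the sensor stores both values
print(sensor_memory["K_GS"].hex())
```

Since $K_{GS}$ depends on the master key only through the hash, capturing one sensor reveals that sensor's key but not $M_k$ itself, provided $h$ is one-way.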

User registration
All users within the smart city network must be registered at their respective gateway nodes. During this phase, the users are assigned security tokens that they will deploy to securely acquire data from the sensor devices deployed in a given domain. The following 4 steps are executed during this process.
Step 1 The user $U_i$, through the $MD_i$, generates unique identity $UID_i$ and password $PW_i$. Next, nonce $R_a$ is generated, which is then used to derive value $A_1 = h(PW_i \| R_a)$.
Step 2 The $U_i$ imprints biometric data $\beta_i$ onto the $MD_i$. Finally, registration request $Req = \{UID_i, A_1, \beta_i\}$ is constructed and forwarded to the $GW_k$ over secure channels as shown in Fig. 2.
Step 3 Upon receiving registration request $Req$ from $U_i$, the $GW_k$ selects some random codephrase $CP_i \in CP$ for this particular user $U_i$. Next, it derives tokens $\lambda = h(CP_i)$, $\varepsilon$, $A_2$ and $A_3$. Finally, it stores $UID_i$ in its database before composing registration response $Res = \{f(.), \lambda, \varepsilon, A_2, A_3, P_k\}$ that is sent to the $U_i$ over secured channels.
Step 4 After getting registration response $Res$ from the $GW_k$, the $U_i$ through $MD_i$ stores value set $\{f(.), \lambda, \varepsilon, A_2, A_3, P_k, R_a\}$ in its memory.
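The user-side part of registration (Steps 1 and 2) can be sketched as below. SHA-256 as $h$, the 16-byte nonce, the placeholder biometric bytes and the function name `build_registration_request` are assumptions for illustration; the gateway-side tokens $A_2$ and $A_3$ are not modelled here.

```python
import hashlib
import secrets

def h(*parts: bytes) -> bytes:
    """One-way hash over the concatenation of its inputs (SHA-256 assumed)."""
    return hashlib.sha256(b"".join(parts)).digest()

def build_registration_request(uid: str, password: str, biometric: bytes):
    """Derive A_1 = h(PW_i || R_a) and assemble Req = {UID_i, A_1, beta_i}."""
    r_a = secrets.token_bytes(16)              # nonce R_a, retained on the device
    a_1 = h(password.encode(), r_a)
    request = {"UID_i": uid, "A_1": a_1, "beta_i": biometric}
    return request, r_a

# Placeholder inputs; a real biometric template would come from a reader.
req, R_a = build_registration_request("user-001", "correct horse", bytes(16))
print(sorted(req))
```

The password never leaves the device in the clear: only the masked value $A_1$ is placed in the request, while $R_a$ is kept locally as in Step 4.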

Login, authentication and key negotiation
This phase is activated whenever the user $U_i$, through the $MD_i$, wants access to the data held by the sensors. Here, the security tokens assigned during the registration phase are deployed to authenticate $U_i$ to the gateway node $GW_k$. To accomplish this, the following 8 steps are executed.
Step 1 User $U_i$ imprints his/her biometric data $\beta_i^*$ onto the $MD_i$, upon which value $CP_i^* = f(\beta_i^* \oplus \varepsilon)$ is derived. Thereafter, the $MD_i$ checks whether $h(CP_i^*) \stackrel{?}{=} \lambda = h(CP_i)$. The user login session is terminated upon verification failure. Otherwise, $U_i$ has passed the biometric validation and hence proceeds to input unique identity $UID_i$ and password $PW_i$ into the $MD_i$.
Nonce Verification Rule (NVR): $\dfrac{S \mid\equiv \#(A),\; S \mid\equiv T \mid\sim A}{S \mid\equiv T \mid\equiv A}$
To be secure under the BAN logic, the proposed scheme must satisfy the following security goals.
Goal 1: ↔ $SN_j$
In our scheme, 4 messages are exchanged during the login, authentication and key agreement phase. For ease of analysis, we transform these messages into idealized format as follows.
The following initial state assumptions (SA) are also made.
$SA_1$:
Based on the above BAN logic rules, idealized format of the exchanged messages and the initial state assumptions, we prove that the proposed scheme attains all the above security goals through the following BAN logic proof (BℒP).
Using the idealized form of ℒog Req and BR, we obtain $B\mathcal{L}P_1$. Using FPR and NVR on both $B\mathcal{L}P_2$ and $SA_1$ yields $B\mathcal{L}P_3$.
On the other hand, using JR on $B\mathcal{L}P_3$, $SA_6$ and $SA_{12}$ yields $B\mathcal{L}P_4$.
↔ $MD_i$, hence security Goal 5 is attained. On the other hand, NVR is applied to both $B\mathcal{L}P_5$ and $SA_{12}$ to yield $B\mathcal{L}P_6$.
↔ $MD_i$, achieving security Goal 6. Considering idealized formats of both $Auth_1$ and $Auth_3$, the application of BR yields $B\mathcal{L}P_7$ and $B\mathcal{L}P_8$.
Using the MMR on both $B\mathcal{L}P_7$ and $SA_9$ results in $B\mathcal{L}P_9$. However, the application of MMR on both $B\mathcal{L}P_8$ and $SA_4$ yields $B\mathcal{L}P_{10}$.
$B\mathcal{L}P_{10}$: ↔ $SN_j$, and hence Goal 8 is attained.
The attainment of all the 8 formulated security goals demonstrates that the proposed scheme achieves strong mutual authentication among the $SN_j$, $MD_i$ and $GW_k$. In addition, it confirms that after successful mutual authentication, session key $SK_D = SK_G = SK_S$ is established among these three entities.

Informal security analysis
In this sub-section, we state and prove various propositions to show that our scheme supports numerous security features and is robust against many typical smart city attacks. Based on the attack model in "Attack model" section, an adversary is capable of launching attacks such as de-synchronization, denial of service, eavesdropping, session hijacking, known session-specific temporary information (KSSTI), replays, forgery, man-in-the-middle (MitM), privileged insider, physical, side-channeling and impersonation. We demonstrate that our protocol mitigates all these attacks.

Proposition 1 Eavesdropping attacks are prevented.
Proof Suppose that an adversary Å is interested in intercepting the exchanged messages, after which parameters such as $SNID_j$ and $UID_i$ are retrieved. In our scheme, messages ℒog

Computation costs
The proposed scheme is implemented on a laptop with the specifications in Table 2. On this platform, the execution times for elliptic curve point multiplication ($T_{EM}$), one-way hashing ($T_H$) and elliptic curve point addition ($T_{EA}$) are approximately 21.74 ms, 0.63 ms and 6.75 ms, respectively.
During the login, authentication and key negotiation phase, the $MD_i$ executes 2 ECC point multiplications and 8 one-way hashing operations. On the other hand, the $GW_k$ carries out a single ECC point multiplication and 9 one-way hashing operations. On its part, the $SN_j$ executes only 4 one-way hashing operations. Therefore, the total computation cost of our scheme is $21T_H + 3T_{EM}$. Table 3 presents the comparative evaluation of the computation costs of our scheme against other related schemes.
As shown in Fig. 4, the scheme developed in [71] incurs the highest computation costs of 251.33 ms. This is attributed to the numerous elliptic curve point multiplications which are computationally intensive.
Some of the anticipated limitations that are likely to crop up during the practical implementation of our scheme are its slightly high communication costs and the need for a biometric reader at the user mobile device $MD_i$. Specifically, the accurate recovery of biometric tokens via fuzzy extraction is not a trivial exercise.
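The operation counts above can be tallied with a short script. The per-operation timings are the measured values quoted in this section; the dictionary layout and variable names are assumptions of this sketch.

```python
# Measured execution times in milliseconds, as quoted in this section.
T_EM = 21.74   # elliptic curve point multiplication
T_H = 0.63     # one-way hashing

# Operation counts per entity during login, authentication and key negotiation.
operations = {
    "MD_i": {"EM": 2, "H": 8},
    "GW_k": {"EM": 1, "H": 9},
    "SN_j": {"EM": 0, "H": 4},
}

def cost_ms(counts):
    """Computation cost of one entity in milliseconds."""
    return counts["EM"] * T_EM + counts["H"] * T_H

per_entity = {name: cost_ms(counts) for name, counts in operations.items()}
total_em = sum(c["EM"] for c in operations.values())
total_h = sum(c["H"] for c in operations.values())
total_ms = sum(per_entity.values())

for name, ms in per_entity.items():
    print(f"{name}: {ms:.2f} ms")
print(f"total: {total_h}T_H + {total_em}T_EM = {total_ms:.2f} ms")
```

Under these timings, $21T_H + 3T_{EM}$ works out to roughly 78.45 ms, of which the sensor node contributes only about 2.52 ms because it performs hashing operations alone.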

Conclusion and future work
The security, privacy and performance issues in smart cities have attracted a lot of attention from industry and academia. Therefore, past research works have developed a myriad of security solutions for this environment. In the majority of these approaches, public key cryptography, blockchain and bilinear pairing operations are utilized. As such, the resulting authentication process is computationally intensive and hence long latencies can be experienced. In addition, they place high communication, energy and storage overheads on the resource-limited smart city sensor devices. Motivated by this, we have presented a biometric-based scheme that has been demonstrated to incur the least computation overheads. Its formal security analysis has shown that it performs strong mutual authentication and key negotiation in an appropriate manner. In addition, informal security analysis has shown that it is secure under all the threat assumptions in the Canetti and Krawczyk attack model. Future research work will involve further reductions in the communication overheads, which are observed to be slightly higher compared with some of its peers.

Figure 2. System setup and registration.